2,592 research outputs found

    A Measure of the Proportion of Treatment Effect Explained by a Surrogate Marker

    Randomized clinical trials with rare primary endpoints or long duration times are costly. Because of this, there has been increasing interest in replacing the true endpoint with an earlier measured marker. However, surrogate markers must be appropriately validated. A quantitative measure of the proportion of treatment effect explained by the marker in a specific trial is a useful concept. Freedman, Graubard, and Schatzkin (1992, Statistics in Medicine 11, 167–178) suggested such a measure of surrogacy: the ratio of regression coefficients for the treatment indicator from two separate models, with and without adjustment for the surrogate marker. However, this measure has been shown to be highly variable, and there is no guarantee that both models fit. In this article, we propose alternative measures of the proportion explained that adapt an idea in Tsiatis, DeGruttola, and Wulfsohn (1995, Journal of the American Statistical Association 90, 27–37). The new measures require fewer assumptions in estimation and allow more flexibility in modeling. The estimates of these different measures are compared using data from an ophthalmology clinical trial and a series of simulation studies. The results suggest that the new measures are less variable.
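
    For orientation, the Freedman-type measure discussed above is commonly written as the relative attenuation of the estimated treatment coefficient once the surrogate is added to the model; this is the standard textbook form, not necessarily the authors' exact notation:

```latex
% \hat{\beta}   : treatment coefficient, model WITHOUT the surrogate marker
% \hat{\beta}_S : treatment coefficient, model adjusted FOR the surrogate marker
\mathrm{PTE} \;=\; \frac{\hat{\beta} - \hat{\beta}_S}{\hat{\beta}} \;=\; 1 - \frac{\hat{\beta}_S}{\hat{\beta}}
```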

    Mixture cure model with random effects for the analysis of a multi-center tonsil cancer study

    Cure models for clustered survival data have the potential for broad applicability. In this paper, we consider the mixture cure model with random effects and propose several estimation methods based on Gaussian quadrature, rejection sampling, and importance sampling to obtain the maximum likelihood estimates of the model for clustered survival data with a cure fraction. The methods are flexible enough to accommodate various correlation structures. A simulation study demonstrates that the maximum likelihood estimates of the parameters tend to have smaller biases and variances than estimates obtained from existing methods. We apply the model to a study of tonsil cancer patients clustered by treatment center to investigate the effect of covariates on the cure rate and on the failure time distribution of the uncured patients. The maximum likelihood estimates of the parameters indicate strong correlation among the failure times of uncured patients and weak correlation among cure statuses within the same center. Copyright © 2010 John Wiley & Sons, Ltd.
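
    To make the estimation idea concrete, the sketch below evaluates the marginal log-likelihood of one cluster for a deliberately simplified mixture cure model: a logistic incidence part with a normally distributed cluster-level random intercept, integrated out by Gauss–Hermite quadrature, and an exponential latency distribution. Both simplifications are illustrative assumptions, not the paper's exact specification.

```python
# Marginal log-likelihood of one cluster (treatment center) under a
# simplified mixture cure model with a normal random intercept u in the
# incidence part, integrated out by Gauss-Hermite quadrature.
import numpy as np
from scipy.special import expit

def cluster_loglik(t, delta, X, gamma, lam, sigma, n_nodes=15):
    """t: follow-up times; delta: 1 = event, 0 = censored; X: covariates."""
    z, w = np.polynomial.hermite.hermgauss(n_nodes)  # weight function exp(-z^2)
    logL = np.empty(n_nodes)
    for k, zk in enumerate(z):
        u = np.sqrt(2.0) * sigma * zk                # change of variables
        p_unc = expit(X @ gamma + u)                 # P(uncured | x, u)
        S_u = np.exp(-lam * t)                       # latency survival
        f_u = lam * S_u                              # latency density
        Li = np.where(delta == 1,
                      p_unc * f_u,                   # observed failure
                      (1.0 - p_unc) + p_unc * S_u)   # cured or still at risk
        logL[k] = np.log(Li).sum()
    m = logL.max()                                   # stable log-sum-exp
    return m + np.log(np.sum(w * np.exp(logL - m)) / np.sqrt(np.pi))
```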

    Optimizing Dynamic Predictions from Joint Models using Super Learning

    Joint models for longitudinal and time-to-event data are often employed to calculate dynamic individualized predictions used in numerous applications of precision medicine. Two components of joint models that influence the accuracy of these predictions are the shape of the longitudinal trajectories and the functional form linking the longitudinal outcome history to the hazard of the event. Finding a single well-specified model that produces accurate predictions for all subjects and follow-up times can be challenging, especially when considering multiple longitudinal outcomes. In this work, we use the concept of super learning and avoid selecting a single model. In particular, we specify a weighted combination of the dynamic predictions calculated from a library of joint models with different specifications. The weights are selected to optimize a predictive accuracy metric using V-fold cross-validation. As predictive accuracy measures, we use the expected quadratic prediction error and the expected predictive cross-entropy. In a simulation study, we found that the super learning approach produces results very similar to those of the Oracle model, i.e., the model with the best performance in the test datasets. All proposed methodology is implemented in the freely available R package JMbayes2.
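
    The weight-selection step can be sketched in a few lines: given out-of-fold predictions from a library of K models, the super learner solves a constrained optimization for simplex weights minimizing a cross-validated quadratic (Brier-type) loss. The names cv_preds and y are hypothetical; the actual implementation lives in the R package JMbayes2.

```python
# Simplex-constrained weight selection for stacking a library of K models.
# cv_preds: hypothetical (n_obs x K) matrix of out-of-fold predicted event
# probabilities; y: observed event indicators.
import numpy as np
from scipy.optimize import minimize

def super_learner_weights(cv_preds, y):
    K = cv_preds.shape[1]

    def quad_loss(w):
        # expected quadratic prediction error of the weighted combination
        return np.mean((y - cv_preds @ w) ** 2)

    res = minimize(quad_loss, np.full(K, 1.0 / K), method="SLSQP",
                   bounds=[(0.0, 1.0)] * K,
                   constraints=({"type": "eq",
                                 "fun": lambda w: np.sum(w) - 1.0},))
    return res.x  # the final prediction is new_preds @ res.x
```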

    Modeling intra-tumor protein expression heterogeneity in tissue microarray experiments

    Tissue microarrays (TMAs) measure tumor-specific protein expression via high-density immunohistochemical staining assays. They provide a proteomic platform for validating cancer biomarkers emerging from large-scale DNA microarray studies. Repeated observations within each tumor result in substantial biological and experimental variability. This variability is usually ignored when associating TMA expression data with patient survival outcomes, which generates biased estimates of the hazard ratio in proportional hazards models. We propose a Latent Expression Index (LEI) as a surrogate protein expression estimate in a two-stage analysis. Several estimators of the LEI are compared: an empirical Bayes, a full Bayes, and a varying replicate number estimator. In addition, we jointly model survival and TMA expression data via a shared random effects model. Bayesian estimation is carried out using a Markov chain Monte Carlo method. Simulation studies were conducted to compare the two-stage methods and the joint analysis in estimating the Cox regression coefficient. We show that the two-stage methods reduce bias relative to the naive approach but still lead to underestimated hazard ratios. The joint model consistently outperforms the two-stage methods in terms of both bias and coverage across various simulation scenarios. In case studies using prostate cancer TMA data sets, the two-stage methods yield a good approximation in one data set but an insufficient one in the other. As general advice, the joint model inference should be used whenever results differ between the two-stage methods and the joint analysis. Copyright © 2008 John Wiley & Sons, Ltd.
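
    As a sketch of the empirical Bayes flavor of the two-stage idea, each tumor's replicate mean can be shrunk toward the overall mean, with the shrinkage weight determined by the replicate count and the between/within-tumor variance components. This is a generic shrinkage sketch under assumed-known variance components, not the authors' exact estimator.

```python
# Empirical-Bayes shrinkage sketch for a latent expression index: tumors with
# few replicates or noisy replicates are pulled more strongly toward the
# overall mean mu. Variance components are taken as known; in practice they
# would be estimated (e.g., by REML).
import numpy as np

def lei_empirical_bayes(replicates, mu, sigma2_between, sigma2_within):
    """replicates: list of 1-D arrays, one array of staining scores per tumor."""
    lei = []
    for y in replicates:
        n_i = len(y)
        # reliability weight of the tumor-specific mean
        w = sigma2_between / (sigma2_between + sigma2_within / n_i)
        lei.append(w * y.mean() + (1.0 - w) * mu)
    return np.array(lei)
```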

    Joint partially linear model for longitudinal data with informative drop‐outs


    Analysis on binary responses with ordered covariates and missing data

    We consider the situation of two ordered categorical variables and a binary outcome variable, where one or both of the categorical variables may have missing values. The goal is to estimate the probability of response of the outcome variable for each cell of the contingency table of the categorical variables while incorporating the fact that the categorical variables are ordered. The probability of response is assumed to change monotonically as each of the categorical variables changes level. A probability model is used in which the response is binomial with parameter p_ij for each cell (i, j), and the number of observations in each cell is multinomial. We consider estimation approaches that incorporate Gibbs sampling with order restrictions on p_ij induced via a prior distribution, two-dimensional isotonic regression, and multiple imputation to handle missing values. The methods are compared in a simulation study. Using a fully Bayesian approach with a strong prior distribution to induce ordering can lead to large gains in efficiency, but can also induce bias. Utilizing isotonic regression can lead to modest gains in efficiency while minimizing bias and guaranteeing that the order constraints are satisfied. A hybrid of isotonic regression and Gibbs sampling appears to work well across a variety of scenarios. The methods are applied to a pancreatic cancer case–control study with two biomarkers. Copyright © 2007 John Wiley & Sons, Ltd.

    Regression inference for multiple populations by integrating summary-level data using stacked imputations

    There is a growing need for flexible general frameworks that integrate individual-level data with external summary information for improved statistical inference. This paper proposes an imputation-based methodology where the goal is to fit an outcome regression model with all available variables in the internal study while utilizing summary information from external models that may have used only a subset of the predictors. The method allows for heterogeneity of covariate effects across the external populations. The proposed approach generates synthetic outcome data in each population, uses stacked multiple imputation to create a long dataset with complete covariate information, and finally analyzes the imputed data with weighted regression. This flexible and unified approach attains four objectives: (i) incorporating supplementary information from a broad class of externally fitted predictive models or established risk calculators, which may be based on parametric regression or machine learning methods, as long as the external model can generate outcome values given covariates; (ii) improving the statistical efficiency of the estimated coefficients in the internal study; (iii) improving predictions by utilizing even partial information from models that use a subset of the covariates in the internal study; and (iv) providing valid statistical inference for an external population whose covariate effects potentially differ from those of the internal population. Applications include prostate cancer risk prediction models using novel biomarkers that are measured only in the internal study.